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Abstract. We present a novel and simple method to numerically calculate Fisher Infor- 
mation Matrices for stochastic chemical kinetics models. The linear noise approximation 
is used to derive model equations and a likelihood function which leads to an efficient 
I I computational algorithm. Our approach reduces the problem of calculating the Fisher 

Information Matrix to solving a set of ordinary differential equations. This is the first 
method to compute Fisher Information for stochastic chemical kinetics models without 
I t the need for Monte Carlo simulations. This methodology is then used to study sensitiv- 

es ity, robustness and parameter identifiability in stochastic chemical kinetics models. We 

show that significant differences exist between stochastic and deterministic models as well 
' ' as between stochastic models with time-series and time-point measurements. We demon- 

strate that these discrepancies arise from the variability in molecule numbers, correlations 
^ between species, and temporal correlations and show how this approach can be used in the 

analysis and design of experiments probing stochastic processes at the cellular level. The 
— algorithm has been implemented as a Matlab package and is available from the authors 

upon request. 

^' 

Understanding the design principles underlying complex biochemical networks cannot 
^—i be grasped by intuition alone \1\ . Their complexity implies the need to build mathematical 

models and tools for their analysis. One of the powerful tools to elucidate such systems' 
• j-j performances is sensitivity analysis [2] . Large sensitivity to a parameter suggests that the 

rS system's output can change substantially with small variation in a parameter. Similarly 

large changes in an insensitive parameter will have little effect on the behaviour. Tradi- 
tionally, the concept of sensitivity has been applied to continuous deterministic systems 
described by differential equations in order to identify which parameters a given output of 
the system is most sensitive to; here sensitivities are computed via the integration of the 
linearisation of the model parameters [2] . 

In modelling biological processes, however, recent years have have witnessed rapidly in- 
creasing interest in stochastic models [3], as experimental and theoretical investigations 
have demonstrated the relevance of stochastic effects in chemical networks [U [5] . Although 
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stochastic models of biological processes are now routinely being applied to study bio- 
chemical phenomena ranging from metabolic networks to signal transduction pathways [6] , 
tools for their analysis are in their infancy compared to the deterministic framework. In 
particular, sensitivity analysis in a stochastic setting is usually, if at all, performed by 
analysis of a system's mean behaviour or using computationally intensive Monte Carlo 
simulations to approximate finite differences of a system's output or the Fisher information 
matrix with associated sensitivity measures [TJ [8] . The Fisher information has a prominent 
role in statistics and information theory: it is defined as the variance of the score and 
therefore allows us to measure how reliably inferences are. Geometrically, it corresponds 
to the curvature around the maximum value of the log-likelihood. 

The interest in characterising the parametric sensitivity of the dynamics of biochemical 
network models has two important reasons. First, sensitivity is instrumental for deducing 
system properties, such as robustness (understood as stability of behaviour under simul- 
taneous changes in model parameters) [9]. The concept of robustness is of significance, 
in turn, as it is related to many biological phenomena such as canalisation, homeostasis, 
stability, redundancy, and plasticity [lOj. Robustness is also relevant for characterising the 
dependence between parameter values and system behaviour. For instance, it has recently 
been reported that a large fraction of the parameters characterising a dynamical system 
are insensitive and can be varied over orders of magnitude without significant effect on 
system dynamics [HI [121 E] • 

Second, methods for optimal experimental design use sensitivity analysis to define the con- 
ditions under which an experiment is to be conducted in order to maximise the information 
content of the data |14j . Similarly, identifiability analysis uses the concept of sensitivity to 
determine a priori whether certain parameters can be estimated from experimental data 
of a given type [15] . 

We use the linear noise approximation (LNA) as a continuous approximation to Markov 
jump processes defined by the Chemical Master Equation (CME). This approximation has 
previously been used successfully for modelling as well as for inference [HI ITTl |19] . Ap- 
plying the LNA allows us to represent the Fisher Information matrix (FIM) as a solution 
of a set of ordinary differential equations (ODEs). We use this framework to investigate 
model robustness, study the information content of experimental samples and calculate 
Cramer-Rao (CR) bounds for model parameters. Analysis is performed for time series 
(TS) and time point (TP) data as well as for a corresponding deterministic (DT) model. 
Results are compared with each other and provide novel insights into the consequences 
of stochasticity in biochemical systems. Two biological examples are used to demonstrate 
our approach and its usefulness: a simple model of gene expression and a model of the 
p53 system. We show that substantial differences in the structure of FIMs exist between 
stochastic and deterministic versions of these models. Moreover, discrepancies appear also 
between stochastic models with different data types (TS, TP), and these can have signifi- 
cant impact on sensitivity, robustness and parameter identifiability. We demonstrate that 
differences arise from general variability in the number of molecules, correlation between 
them and temporal correlations. 
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1. Chemical kinetics models 



We consider a general system of N chemical species inside a fixed volume and let 
X = (xi, . . . , xtv)-^ denote the number of molecules. The stoichiometric matrix S = 
{sij}i=i^2...N; j=i,2...R describes changes in the population sizes due to R different chemical 
events, where each Sij describes the change in the number of molecules of type i from Xi 
to Xi + Sij caused by an event of type j. The probability that an event of type j occurs in 
time interval [t,t + dt) equals fj{x,Q,t)dt. The functions fj{x,Q,t) are called transition 
rates and G = {9i,...,9l) is a vector of model parameters. This specification leads to a 
Poisson birth and death process with transition densities described by the CME (see Sup- 
plementary Information (SI)). Unfortunately, the CME is not easy to analyze and hence 
various approximations have been developed. As shown in |16L \T7\ [TH] the linear noise 
approximation provides a useful and reliable framework for both modelling and statisti- 
cal inference. It is valid for systems with large number of reacting molecules and is an 
analogy of the Central Limit Theorem for Markov jump processes defined by CME |20j. 
Biochemical reactions are modelled through a stochastic dynamic model which essentially 
approximates a Poisson process by an ODE model with an appropriately defined noise 
process. Within the LNA a kinetic model is written as 



(1) x{t) = ip{t)+m 

(2) ^ = SF{ip,Q,t) 

(3) dc = A{ip,e,t)^ + E{ip,e,t)dw, 

where 

(4) F{^,e,t) = (/i(v9,e,t),..., /;((/., G,t)) 

(5) {^(v',e,t)},, = X:«..|| 

(6) Eiip,e,t) = S^diagiF{^,e,t)). 



Equation ([T]) divides the system's state into a macroscopic state, ip{t) = (0i(t), (/)7v(t)), 
and random fluctuations, ^(t). The macroscopic state is described by an ODE ([2]), the 
macroscopic rate equation (MRE), which in general needs to be solved numerically. Sto- 
chastic fluctuations are governed by a Wiener process (dW) driven linear stochastic 
differential equation ([s]) with an explicit solution readily available (see SI). The variance 
V{t) of the system's state x can be explicitly written in terms of an ODE 

(7) = Ai^, e, t)vit) + v{t)A{^, e, tf + E{^, e, t)E{^, o, tf, 

which is equivalent to the fluctuation-dissipation theorem. Similarly, temporal covariances 
are given by 



(8) 



coM{x{s),x{t)) = V{s)^{s, ty for t > s, 
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where $(s,t) is the fundamental matrix of the non-autonomous system of ODEs 



(9) 



dt 



A{ip,e,t)<^{s,t), ^s,s 



) = I. 



Equations ([T]j9]) are used to derive the hkelihood of experimental data. To account for 
different experimental settings we consider three types of data: time series (TS), time-point 
(TP) and deterministic (DT). For TS measurements are taken from a single trajectory 
(following the same cell) and therefore are statistically dependent; in practise TS data 
are usually obtained using fluorescent microscopy. TP measurements at each time point 
are taken from different trajectories (end time points of trajectories following different 
cells) and are thus independent. These data reflect experimental setups where the sample 
is sacrificed and the sequence of measurements is not strictly associated with the same 
sample path (e.g. flow-cytometry, Q-PCR). DT data are defined as a solution of MRE ^ 
with normally distributed measurement error with zero mean and variance o"^ and refer to 
measurements averaged over population of cells. 

Suppose measurements are collected at times ti, For simplicity we consider the case 

where at each time point ti all components of Xj are measured. In the SI we demonstrate 
that the same analysis can be done for a model with unobserved variables at no extra cost 
other than more complex notation. First let xg = {xt^, ■ ■ ■ ,xt^) be an nN column vector 
that contains all measurements of type Q, where Q G {TP, TS, DT}. It can be shown (see 
SI) that 



(10) 



XQ~MVN(/x(e),SQ(e)) 



where MVN denotes the multivariate normal distribution, 



(11) 



/x(e) = (<^(ti),...,(^(t„)) 



and (f{t) is a solution of the MRE ^ such that (^(0) = ipo and Eg is a (nN) x (nN) 
symmetric block matrix Eq(0) = {Sg(0)'^*'-'^}^_^ n-'-i n ^^^1^ that 



(12) 



SQ(e)(*'^-) 



= < 



for i = j Q G {TS, TP] 

for f = j Q G {DT] 
for i<j Q e {TP, DT] 
for i<jQ£ {TS] 







and V{t) is a solution of eq. ^ for a given initial condition ^(0) = Vq. The MVN 
likelihood is a result of our LNA and is analogous to the Central Limit Theorem for the 
CME. It is valid under the assumption of large number of molecules reacting in the system 



2. Fisher information matrix 



To calculate the FIM for the model (1 — 3), first, suppose that a random variable X 
has an A^-variate normal distribution with density mean = (/ii(0), /i7v(0))"^ 

and covariance matrix S(0). The FIM is then defined [2Tj as I(G)= {I{Q)k,i}i^i^i i, 
where 



® iogW(x,e))) f4-'og«'(-^.e)) 



(13) /(e),, = Ee yge,-^^^^--'j\eo, 

Then I{Q)ij can be expressed as [22] 

The above formula shows that, in order to calculate FIM for a multivariate normal distri- 
bution, it is enough to calculate the covariance matrix parameter derivatives of mean 
and parameter derivatives of the covariance matrix 



In the LNA equations (11) and (12) describe mean and variance, respectively, of experi- 
mental measurements, xq. The mean is given as the solution of an ODE, and the variance 
is either given as a product of solutions of ODEs (TS), directly as a solution of an ODE 
(TP), or is simply constant (DT). Hence, in order to calculate the FIM we calculate the 
derivatives of the solutions of an ODE with respect to the parameters [23J. For illustration, 
consider an dimensional ODE 

(15) z = v{z,e,t), 



where ^ is a scalar parameter. Denote by z{zo, 6, t) the solution of equation ( 15 ) with initial 
condition zq and let C,{t,0) = It can be shown that C, satisfies |23| 

(16) c = J{m,o.t)c + ^^v{z,e,t), 

where J{z{t),Q,t) is the Jacobian ^v{z,6,t). We can thus calculate derivatives ^ 

and ^^gg^^^ that give ^ and ^ needed to compute FIM for the model (1-3) (see SI). 
The FIM is of special significance for model analysis as it constitutes a tool for sensitivity 
analysis, robustness, identifiability and optimal experimental design as we will show below. 

2.1. The FIM and sensitivity. The classical sensitivity coefficient for an observable Q 
and parameter 6 is 

^~ d9- 

The behaviour of a stochastic system is defined by observables that are drawn from a 
probability distribution. The FIM is a measure of how this distribution changes in response 



-'^In the paper we are interested in the expected FT that under standard regularity conditions is equivalent 
to the expected Hessian of the likelihood. The expected FI is different from observed FI defined as Hessian 
of the likelihood of given data. 
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to infinitesimal changes in parameters. Suppose that i{Q;X) = log{'ip{X^Q)) and 
-E[^{Q-X)]. Than, 

i.e. the FIM is the expected Hessian of ^(0, X). Therefore, if 0* is the maximum likehhood 
estimate of a parameter there is a L x L orthogonal matrix C such that, in the new 
parameters 9' = C (0-0*), 



(18) ^(0)^^(0*)-^X;^*^^' 

i=l 

for near 0* . From this it follows that the Aj are the eigenvalues of the FIM and that the 
matrix C diagonalises it. If we assume that the Aj are ordered so that Ai > • • • > A^ then 
it follows that around the maximum the likelihood is most sensitive when 0^ is varied and 
least sensitive when 0^ is varied, and Aj is a measure of this. Since e[ = Y.^=i Cij^Oj - 9*) 

1 /2 

we can regard Sij = A- Cij as the contribution of the parameter 9j to varying 0- and thus 

(19) ^! = E4 



-2 

i=l 



can be regarded as a measure of the sensitivity of the system to 9j. It is sometimes 
appropriate to normalise this and instead consider 

(20) 7- = 

2.2. Robustness. Related to sensitivity, robustness in systems biology is usually under- 
stood as persistence of a system to perturbations to external conditions {24]. Sensitivity 
considers perturbation in a single parameter whereas robustness takes into account simul- 
taneous changes in all model parameters. Near to the maximum 0* the regions of high 
expected log-likelihood ^(0) > ^(0*) — e are approximately the ellipsoids NS{Q*,e) given 
by the equation 

(21) NS{e*,e) = {0 : (0 - 0*)^I(0*)(0 - 0*) < e} . 

The ellipsoids have principal directions given by eigenvectors C and equatorial radii (Ai)^2 . 
Sets NS are called neutral spaces as they describe regions of parameter space in which a 
system's behaviour does not undergo significant changes [TO] and arise naturally in the 
analysis of robustness. 

2.3. Confidence intervals and asymptotics. The asymptotic normality of maximum 
likelihood estimators implies that if 0* is a maximum likelihood estimator then the NS de- 
scribe confidence ellipsoids for with confidence levels corresponding to e. The equatorial 
radii decrease naturally with the square root of the sample size 
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2.4. Parameter identifiability and optimal experimental design. The FIM is of 

special significance for model analysis as it constitutes a classical criterion for parameter 
identifiability [15]. There exist various definitions of parameter identifiability and here we 
consider local identifiability. The parameter vector Q is said to be (locally) identifiable 
if there exists a neighbourhood of such that no other vector Q* in this neighbourhood 
gives raise to the same density as Q [15] . Formula (18) implies that Q is (structurally) 
identifiable if and only if FIM has a full rank |15j . Therefore the number of non-zero 
eigenvalues of FIM is equal to the number of identifiable parameters, or more precisely, to 
the number of identifiable linear combinations of parameters. 

The FIM is also a key tool to construct experiments in such a way that the parameters 
can be estimated from the resulting experimental data with the highest possible statistical 
quality. The theory of optimal experimental design uses various criteria to asses information 
content of experimental sampling methods; among the most popular are the concepts of 
D-optimality that maximises the determinant of FIM, and A-optimality that minimise the 
trace of the inverse of FIM Diagonal elements of the inverse of FIM constitute a 

lower-bound for the variance of any unbiased estimator of elements of G; this is known 
as the Cramer- Rao inequality (see SI). Finally, it is important to keep in mind that some 
parameters may be structurally identifiable, but not be identifiable in practice due to 
noise; these would correspond to small but non-zero eigenvalues of the FIM. Maximizing 
the number of eigen-values above some threshold which reflects experimental resolution, 
may therefore be a further criterion to optimize experimental design. But all of these 
criteria revolve around being able to evaluate the FIM. 



3. Results 

In order to demonstrate the applicability of the presented methodology for calculation 
of FIMs for stochastic models we consider two examples: a simple model of single gene 
expression, and a model of the p53 system. The simplicity of the first model allows us 
to explain how the differences between deterministic and stochastic versions of the model 
as well as TS and TP data arise. In the case of the p53 system model the informational 
content, as well as sensitivities and neutral spaces are compared between TS, TP and DT 
data. 

3.1. Single gene expression model. Although gene expression involves numerous bio- 
chemical reactions, the currently accepted consensus is to model it in terms of only three 
biochemical species (DNA, mRNA, and protein) and four reaction channels (transcription, 
mRNA degradation, translation, and protein degradation) (e.g. [26l|21]). Such a simple 
model has been used successfully in a variety of applications and can generate data with 
the same statistical behaviour as more complicated models [271 EE] • We assume that the 
process begins with the production of mRNA molecules (r) at rate kr- Each mRNA mole- 
cule may be independently translated into protein molecules {p) at rate kp. Both mRNA 
and protein molecules are degraded at rates jr and 7p, respectively. Therefore, we have 
the state vector x = {r,p), and reaction rates corresponding to transcription of mRNA, 
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translation, degradation of mRNA, and degradation of protein. 



(22) 



F{x) = {kr,kpr,'yrr,'ypp). 



Identifiability study. In a typical experiment only protein levels are measured [17^ |29] . It 
is not entirely clear a priori what parameters of gene expression can be inferred; it is also 
not obvious if and how the answer depends on the nature of the data (i.e. TS, TP or DT). 
We address these questions below. 

We assumed that the system has reached the unique steady state defined by the model and 
that only protein level is measured either as TS 



where the upper indices for TP measurements denote the number of trajectories from 
which the measurement have been taken to emphasise independence of measurements. 
Results of the analysis are presented in Table 2 SI. For TS data we have four identifiable 
parameters whereas time-point measurements provide enough information to estimate only 
two parameters. To some extent this makes intuitive sense: TS data contain information 
about mean, variance and autocorrelation functions, which can be very sensitive to changes 
in degradation rates; TP measurements reflect only information about mean and variance 
of protein levels therefore only two parameters are identifiable. On the other hand TP 
measurements provide independent samples which is refiected in lower Cramer-Rao bounds. 
Table 2 SI contains also a comparison with the corresponding deterministic model. As one 
might expect in the deterministic model only one parameter is identifiable as the mean is 
the only quantity that is described by the deterministic model, and parameter estimates 
are informed neither by variability nor by autocorrelation. 

Perturbation experiment. In order to demonstrate that identifiability is not a model specific 
but rather an experiment specific feature we performed a similar analysis as above for the 
same model with the same parameters but with the 5 fold increased initial mean and 25 fold 
increased initial variance. Results are presented in Table 3 SI. Some of the conclusions that 
can be made are hard to predict without calculating the FIM. The amount of information 
in TS data is now much larger than in TP data (higher determinant) and also CR bounds 
are now much lower for TP than for TS data. CR bounds for TS and TP are substantially 
lower than for the steady state data (except kr)- Interestingly, all four parameters can be 
inferred from TS and TP data, but not in the deterministic scenario. For steady state data 
all parameters could only be inferred from TS data (Table 3 SI ). 

Maximising the information content of experimental data. The amount of information 
in a sample does not depend solely on the type of data (TS, TP), but also on other 
factors that can be controlled in an experiment. One easily controllable quantity is the 
sampling frequency A. We consider here only equidistant sampling and keep number of 
measurements constant. Therefore we define A as time between subsequent observations 
A = tj+i — ti. To show how sampling frequency infiuences informational content of a 



(23) 




or as TP 



(24) 
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sample for the model of gene expression we used four parameter sets (Table 1 SI) and 
assumed that the data have the form (23). The amount of information in a sample was 
understood as the determinant of the FIM, equivalent to the product of the eigenvalues 
of the FIM. Results in Figure [T] demonstrate that our method can be used to determine 
optimal sampling frequency, given that at least some rough estimates of model parameters 
are known. It is worth noting that equidistant sampling is not always the best option and 
more complex strategies have been proposed in experimental design literature. 
Differences in sensitivity and robustness analysis in TS, TP and DT versions of the model. 
TS, TP and DT versions of the model differ when one considers information content of 
samples, and such discrepancies exist also when sensitivity and robustness are studied. 
First, deterministic models completely neglect variability in molecular species. Variability, 
however, is a function of parameters, and like the mean, is sensitive to them. Second, de- 
terministic models do not include correlations between molecular species. Third, temporal 
correlations are neglected in TP and DT models. 

To understand these effects we first analyse the analytical form of means, variances and 
correlations for this model (see SI). We start with the effect of incorporating variabil- 
ity. Suppose we consider a change in parameters, e.g. kp,'yp by a factor 5 {kp,^p) — )• 
{kp + 8kp,^p + S'jp). The means of RNA and protein concentrations are not affected by 
this perturbation, whereas the protein variance does change (see formulae (33-37) in SI). 
This result is related to the number of non-zero eigenvalues of the FIM. The FIM for the 
stationary distribution of this model with respect to parameters kp,jp has only one posi- 
tive eigenvalue for the deterministic model and two positive eigenvalues for the stochastic 
model. 

To study the effect of correlation between RNA and protein levels prp we first note that 
formulae (33 - 37) in SI demonstrate that at constant mean, correlation increases with 7^ 
when accompanied by a compensating increase in kp. Figure [2] (left column) presents neu- 
tral spaces (21) for parameter pairs for different values of correlation, p^p. The differences 
between DT and TS are enhanced by the correlation. 

Similar analysis reveals that taking account of the temporal correlations also changes the 
way the model responds to parameter perturbations. Figure [2] (right column) shows neutral 
spaces for three different sampling frequencies and indicates that the differences between 
stochastic and deterministic models decrease with A. 



3.2. Model of p53 system. The model of single gene expression is a linear model with 
only four parameters and a simple stationary state and illustrates how the methodology can 
be used to provide relevant conclusions and investigate discrepancies between sensitivities 
of TS, TP and DT models. Our methodology, however, can also be used to study more 
complex models, and here we have chosen the p53 signalling system, which incorporates a 
feedback loop between the tumour suppressor p53 and the oncogene Mdm2, and is involved 
in regulation of cell cycle and response to DNA damage. 

We use the model introduced in [30j that reduces the system to three molecular species, 
p53, mdm2 precursor and mdm2, denoted here by p, yo and y, respectively. The state of 
the system is therefore given by x = (p, yo,y), and the seven reactions present in the model 
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are described by the stoichiometry matrix 

(25) S = 
and occur at rates 

P 

F{x) = {K,axP,aky — —r^byp,aoyo,ayy), 
p + k 

so that the vector of model parameters can be written as 

= {hx, ax, ak, k, Uy, qq, ay). 
The above specification allow us to formulate a macroscopic rate equation model 

(26) = bx - ax(t)p - ak(l)y , , 

cpp + k 

(27) = by(t)p - ao(j)yo 

(28) <j)y = aO0S/O - CLyCpy. 

Informational content of TS and TP data for the p53 system. In the case of the single 
gene expression model we have argued that TS data are more informative due to account- 
ing for temporal correlations. On the other hand TP measurements provide statistically 
independent samples, which should increase informational content of the data. Therefore 
it is not entirely clear what data type is better for a particular parameter. If, for instance, 
a parameter is entirely informed by a system's mean behaviour than TP data will be more 
informative because TP data provide statistically independent samples about the mean. 
Whereas if a parameter is also informed by temporal correlations, then TS data will turn 
out to be more informative. It is difficult to predict a priori which effect will be dominating. 
Therefore calculation of FIM and comparison of their eigenvalues and diagonal elements is 
necessary. Eigenvalues and diagonal elements of FIMs calculated for parameters presented 
in Table 4 SI are plotted in Figure |3] and Figure |4| respectively. Eigenvalues of the FIM 
for TS data are larger than for TP data. Similarly, diagonal elements for all parameters 
are larger for TP than for TS data for most parameters difference is substantial. This 
indicates that temporal correlation is a sensitive feature of this system and provides signif- 
icant information about model parameters. The lower information content of the TP data 
can, however, be compensated for by increasing the number of independent measurements, 
which is easily achievable in current experimental settings (see Figure 7 SI). For determin- 
istic models the absolute value of elements of FIM depends on measurement error variance 
and therefore FIMs of TS and TP data can not be directly compared with the DT model. 
Sensitivity. The sensitivity coefficients Ti for TS, TP and DT data are presented in Figure 
[4} Despite differences outlined previously, here sensitivity coefficients are quite similar for 
all three types suggesting that the hierarchy of sensitive parameters is to a considerable 
degree independent on the type of data. The differences exist, however, in contributions 
Cfj (see Figure 6 in SI), suggesting discrepancies in neutral spaces and robustness analysis 
that we present below. 
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Neutral spaces. Comparison of the neutral spaces (21) for each pair of data types and for 
each pair of the parameters are given in Figures 3-5 SI. The conclusion we can draw from 
these figures is that NSs for TS, TP and DT model exhibit substantial differences; these 
differences, however, are limited to certain parameter pairs. Differences between NPs of 
TS and DT models are exhibited in pairs involving parameters bx, ay] between TS and TP 
in pairs involving b^; and between TP and DT also pairs involving bx- 
This suggests that parameter bx is responsible either for the variability in molecular num- 
bers or the correlation between species, as these are responsible for differences between 
TP and DT models. Similarly the lack of differences in pairs involving Uy in comparisons 
of TP and DT, and their presence in comparison of TP and TS indicates that parameter 
ay is responsible for regulating the temporal correlations. This analysis agrees with what 
one might intuitively predict. Parameter bx describes the production rate, and therefore 
the mean expression level of p53, and also the variability of all components of the system. 
It is difficult, however, to say how this parameter influences correlations between species. 
Parameter ay, on the other hand, is the degradation rate of mdm2 and therefore clearly 
determines the temporal correlation of not only mdm2 but also of p53, because mdm2 
regulates the degradation rate of p53. While heuristic, our analysis of the neutral spaces 
nevertheless clearly demonstrates the differences between the three types of models and 
creates a theoretical framework for investigating the role of parameters in the stochastic 
chemical kinetics systems and without the need to perform Monte Carlo sampling or other 
computationally expensive schemes. 



4. Discussion 

The aim of this paper was to introduce a novel theoretical framework that allows us to 
gain insights into sensitivity and robustness of stochastic reaction systems through analysis 
of the FIM. We have used the linear noise approximation |3 HI16[[T7] to model means, vari- 
ances and correlations in terms of appropriate ODEs. Differentiating the solution of these 
ODEs with respect to parameters |23j allowed us to numerically calculate derivatives of 
means, variances and correlations, which combined with the normal distribution of model 
variables implied by the LNA gave us the representation of the FIM in terms of solutions of 
ODEs. To our knowledge this is the first method to compute FIM for stochastic chemical 
kinetics models without the need for Monte Carlo simulations. 

Given the role of the FIM in model analysis and increasing interest in stochastic mod- 
els of biochemical reactions, our approach is widely applicable. It is primarily aimed at 
optimising or guiding experimental design, and here we have shown how it can be used 
to test parameter identifiability for different data types, determine optimal sampling fre- 
quencies, examine information content of experimental samples and calculate Cramer- Rao 
bounds for kinetic parameter estimates. Its applicability, however, extends much further: 
it can provide a rationale as to which variables should be measured experimentally, or 
what perturbation should be applied to a system in order to obtain relevant information 
about parameters of interest. Similar strategies can also be employed in order to optimise 
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model selection procedures. As demonstrated here stochastic data incorporating informa- 
tion about noise structure are more informative and therefore experimental optimisation 
for stochastic models models may be advantageous over similar methods for deterministic 
models. 

A second topical application area is the study of robustness of stochastic systems. Interest 
in robustnesses results from the observation that biochemical systems exhibit surprising 
stability in function under various environmental conditions. For deterministic models this 
phenomenon has been partly explained by the existence of regions in parameter space (neu- 
tral spaces) |10j . in which perturbations to parameters do not result in significant changes 
in system output. We have demonstrated that even a very simple stochastic linear model 
of gene expression exhibits substantial differences when its neutral spaces are compared 
with the deterministic counterpart. Therefore a stochastic system may respond differently 
to changes in external conditions than the corresponding deterministic model. Our study 
presents examples of changes in parameters that do not affect behaviour of a deterministic 
systems but may substantially change a probability distribution that defines the behaviour 
of the corresponding stochastic system. Thus for systems in which stochasticity plays an 
important role random effects can not be neglected when considering issues related to ro- 
bustness. 

Acknowledgments MK and MPHS acknowledge support from the BBSRC (BB/G020434/1). 
DAR holds an EPSRC Senior Fellowship (GR/S29256/01), and his work and that of MC 
were funded by a BBSRC/EPSRC SABR grant (BB/F005261/1, ROBuST project). DAR 
and MK were also supported by the European Union (BIOSIM Network Contract 005137). 
MPHS is a Royal Society Wolfson Research Merit Award holder. 




0.2 0.4 O.e O.B 1 1.2 1.4 1.6 l.B 2 



Figure 1 . Determinant of FIM plotted against sampling frequency A (in 
hours). We used logarithms of four parameter sets (see Table 1 SI). Sets 
1 and 3 correspond to slow protein degradation (7^ = 0.7); and Sets 2 
and 4 describe fast protein degradation (7^ = 1.2). We assumed that 50 
measurements {n = 50) of protein levels were taken from the stationary 
state. Observed maximum in information content results from the balance 
between independence and correlation of measurements. 
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Figure 2. Neutral spaces for TS and DT versions of the model of single 
gene expression for logs of parameters k^. and 7p. Left: Differences resulting 
from RNA, protein correlation: p^p = 0.1 (top) prp = 0.5 (middle), prp = 0.9 
(bottom). Correlation 0.5 corresponds to parameter set 3 from Table 1 SI 
and was varied by equal-scaling of parameters kp,^p. Right: Differences 
resulting from temporal correlations. We assumed n = 50 and tuned corre- 
lation between observation by changing sampling frequency A = 0.3h (left) 
A = 3h (middle) A = 30h (right). Set 3 of parameters was used (Table 1 
SI). 



Figure 3. Eigenvalues of FIM for p53 model for three data types: time 
series (blue), time-points (green) and deterministic model (red). Eigen val- 
ues were normalised against maximal eigenvalue for each data type (top) 
and against maximal eigenvalue among all three types (bottom). FIM was 
calculated for logs of parameters from Table 4 SI . 
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Figure 4. Left: Diagonal elements of FIM for TS and TP versions of 
p53 model. Values of FIM for DT verison are not presented as they can 
not be compared with those for stochastic models. Right: Sensitivity 
coefficients Ti for TS, TP, DT version of p53 model. FIMs were calculated 
for parameters presented in Table 4 SI. 



TS (heatmap) and DT (contour plot) TS (heatmap) and TP (contour plot) 




-0.1 0.1 -0 1 0.1 

Sloglbp 6log(b_) 



Figure 5. Neutral spaces for TS, TP, and DT versions of p53 model 
for logs of two parameter pairs (00,0^) and {bx,ay). Left column presents 
differences resulting form general variability, correlations between species 
and temporal correlation (comparison of TS and TP models). Right col- 
umn shows differences due to variability and correlation between species 
(comparison of TS and TP models) . Top row is an example of parameters 
for which differences are negligible, bottom row presents a parameter pair 
with substantial differences. FIM was calculated for 30 measurements of all 
model variables and A = Ih. 
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Sensitivity, robustness and identifiability in stochastic chemical 

kinetics models 
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1. Centre for Bioinformatics, Imperial College London, UK 
2. Systems Biology Centre, and Mathematics Institute, University of Warwick, UK 

This is supplementary information for the paper Sensitivity, robustness and identifiability in 
stochastic chemical kinetics models which is henceforth referred to as MP. 



We consider a general system of chemical reactions that consists of chemical species 
and interacts in a fixed volume through R reactions. Let x = (xi, . . . iX^Y be the vector 
representing the numbers of molecules for the N species and S = {sij}i=i,2...Af; 3=1,2. ..r be 
the stoichiometry matrix that describes changes in the population sizes due to each of the 
reactions, so that occurrence of reaction j results in a change 



We assume that the probability that a reaction of type j occurs in the time interval [t, t + rft) 
equals fj{x, Q)dt, where functions fj{x, 0) are called the reaction rates and = (^^i, 0^) is 
the vector of all model parameters. The probability that more than one event will take place 
in a small time interval is of higher order [dt'^) with respect to the length of the interval and 
can thus be ignored. Finally, we assume that events taking place in disjoint time intervals 
are independent, when conditioned on the events in the previous interval. This specification 
leads to a Poisson birth and death process; the Chemical Master Equation [1, 2] is widely 
used to describe the temporal evolution of the probability P{x,t) that the system is in the 
state X at time t 



1 Derivation of the model equations 



(Xi, ....,Xn) {Xi + Sij, ...,Xn + SNj). 



dP{x,t) 
Jt 



R 



X - j, t)fj{x - s. j, e, t) - p{x, t)fj{x, e, t)) . 



(1) 
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Under the assumption that molecular species are present in sufficiently large copy numbers 
the model defined above is well described by the following system of equations [1, 3, 4] 



x{t) = ip{t)+m (2) 

^ = SF(^,e,t) (3) 

dc = A{if,e,t)c + E{if,e,t)dw, (4) 



where 



{A{^,e,t)u = f2^^j^ (6) 



F{<p, e, t) = e, t), fnicp, e, t)) (5) 

Ei^, 0, t) = S,/dtag{F{^,e,t)). (7) 

Equation (3) is an ordinary differential equation that in general does not have an explicit 
solution but can be solved numerically, whereas equation (4) is a linear stochastic differential 
equation that has a solution of the form [5] : 

m = Hto, ^to + f ^{s, t)E(^, e, s)dW(s), (8) 

J to 

where the integral is in the Ito sense and ^{to, s) is the fundamental matrix of the non- 
autonomous system of ODEs 

^5^ = /l(^,e,s)$(to,s), $(to,to) = /. (9) 

In order to simplify the further analysis of the system studied, suppose that the initial 
condition has a multivariate normal distribution (MVN) a;(0) ~ MVN{lp{0),V{0)). 

This specification of an initial condition together with equations (2 - 4, 8) implies that 
x{t) has a multivariate normal distribution [5, 6] 

x{t) ~ MVN{ip{t), V{t)) t > 0, (10) 

where (p{t) is a solution of the macroscopic rate eqaution (MRE), Eqn. (3), with initial 
condition (f{0), and V{t) is a variance at time t. Direct calculations using equations (2 - 4, 
8) show that V satisfies 

dV{t) ^ ,,,,,,, , ,,,,, ^ , z..,, o .^z..,, o .^T 



dt 



A{cp, e, t)v{t) + v{t)A{cp, e, ty + E{cp, e, t)E{^, e, ty , (ii) 
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which is equivalent to the fluctuation dissipation tlicorcm [1]. In the further sections we 
wiU also need to specify covariances, cov(x(s), (t > s), and therefore we calculate these 
here. As {x{t)) = (p{t) we have that cov(a;(s), = {^{t)^{s)'^) and therefore equation (8) 
implies 

coY{x{s),x{t)) = V{s)^{s, tf. (12) 

2 Derivation of the likelihood function 

In the previous section we have explained that x(t) ~ MVN((p(t), V{t)). Now we derive the 
distribution of experimental data. Three different data types are considered: time series, 
time-point measurements, and deterministic model data. 

2.1 Time series data 

We start with the case where a single trajectory is measured at times ti, t„. Initially sup- 
pose that all molecular species Xi are measured. Later we demonstrate that this assumption 
is easily relaxed. First let xts = {xtn ■ ■ ■ :Xt^) be a nN column vector that contains all 
measurements and (f{(po, ©, t) be a solution of equation (3) such that (f{(po, ©, 0) = </?o, and 
let V(yo,Q,t) be a solution of equation (11 ) such that y(Vo,0,O) = Vq. In order to find 
the distribution of vector xyg we write XtQ = <p{to) +Qo> where ~ MVN{0, Vq) and using 
equations (3-4) and (8) we have 

xti = Vti + Hh, ti)qto + 
where ^t, ~ (0,Si) and Si = J^^ ^{s,ti)^E{s)^E{s)^s,ti)ds. Using 

$(t,_i, t,+i) = $(t,-, t,+i)$(t,-i, tj) (13) 
we can analogously write x^. as 

i 

where are independently normally distributed random variables with mean and covari- 
ance matrix = J^*^ ^ ^{s,tj)-'" E{s)'^ E{s)^{s,tj)ds. This representation demonstrates that 
Xt- is a linear sum of multivariate normal variables and therefore xt5 has a multivariate 
normal distribution with mean /i(6) and covariance matrix J^Tsi^) 

^Ts^MVN{fi{e),ETs{e)) (15) 
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where fi{Q) = {(fi{ti), ...,0{tn)) and St5(6) is a {n x N) x {n x N) block matrix St5(0) = 
|E(0)(*'-^^}^_^ AT-j-i AT ^^^^ that diagonal elements contain variances S(0)*^*'*) = V^(ti) and 
non-diagonal elements (i ^ j) covariances S(O)*^*'''^ = cov (x {t, i), x{tj)). Diagonal elements 
are given by the solution V. From representation (14) we have 

^'''^' = ^'''Ht„t,+,f, (16) 

which demonstrates that non-diagonal elements can be easily computed from diagonal ele- 
ments given by solutions of equation (11). 

2.2 Time-point measurements 

Here we consider the case where in an experiment at each time point ti,...,t„ a different 
trajectory is measured. Therefore, measurements come from the same process x(t) but from 
its independent realisations. We define the measurement vector as x^p = {xll\ . . . 
Upper indices indicate the number of trajectories from which the measurements were taken 
in order to emphasis that each measurement is taken from a different trajectory. The distri- 
bution of x[^^ is given by (10). All measurements are independent so that cov(xt., x^.) — 
for i ^ j, therefore 

xtp ~ MVN{i2{&), Etp(0)) (17) 

where /x(0) = (<^(ti), <^(t„)) and Stp(0) has the same diagonal blocks as Ers'(0) and 
non-diagaonal blocks are equal to 

Stp(0)(^'^) = I ^IJ^^ '=^- (18) 
I for z 7^ J. 

2.3 Deterministic model 

In order to study differences between stochastic and deterministic regimes we also consider 
a deterministic model where the system state is described entirely by the MRE (3) . In such 
a model measurements are usually assumed to have the form 

x{ti) = (p{ti) + eti, 

, where . is a normally and independently distributed measurement error with mean and 
constant variance a^. We denote the measurements for this model by = {xti, ■ ■ ■ ,Xt„)- 
Finding the data distribution for this case is straightforward, 

Kdt MVN{ii{Q),Edt), (19) 

where //(0) is as in the previous cases and T^dt is a N'^n^ diagonal matrix with diagonal 
elements equal to cr^. 
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2.4 Hidden variables 



Usually it is not possible to measure all variables present in the system of interest experi- 
mentally. Hence we here demonstrate that the distribution of observed components can be 
directly extracted from the distributions (15, 17, 19). For simplicity we consider the case of 
time series data only; analysis for the two other data types proceeds analogously. First, we 
partition the process x{t) into those components y{t) that are observed and those z{t) that 
are unobserved. Let y-r^ = {yti, ■ ■ ■ ,ytn) ^ts = {^ti, ■ ■ ■ T^t„) denote the time series 
that of y and respectively, at times ti, . . . tn- 

The distribution of yrs is a marginal distribution of xts', we thus have 

Yts MVNinyie),tie)), (20) 

where iJ>y{Q) and S(0) are elements of /x(0) and Sts(0) that correspond to the observed 
components of Xts- If for instance first M out of components of x are observed than 
y{t) = {xi{t),...,XM{t)), and 

/iy(6) = ((^Af (tl), (pM{tn)) (21) 

where (fM{t)-=^ (M't), 0m(^)) and E(e)= is a MN x MN block matrix 

E(e) = {E(e)(''^n. (22) 

where 

E(^^)(e) = E(y)(e) p=l,...,M,q = 1,...,M. (23) 



3 Calculation of the Fisher Information Matrix (FIM) 



Suppose that a random variable X has an A'"-variate normal distribution with mean /x(9) = 
(/ii(6), ...,/iAr(9))^ and covariance matrix E(©). We define the FIM for this variable to be 

/(e)= {/(e),,} [7] 



I{e)k,i = Ee 



d_ 



iog(^(x,e)) 



d_ 
89, 



iog(^(x,e)) 



(24) 



where ■0(.) is the density function of a multivariate normal distribution with mean /i(©) and 
covariance E(6). As the random variable X is normally distributed the elements I{Q)k,i can 
be also expressed explicitly as 



da^^.^.da 1 ,^ i <9E ^ i<9E, 



(25) 
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In this section we demonstrate how to calculate the FIM for the models (2- 4). We consider 
the model for time series data, as the case of time-point measurements is less general and 
can be directly extracted from formulas derived below. Previously, we have shown 15 that 
variable x^s has a multivariate normal distribution and demonstrated how its mean 
and covariance matrix E(©) can be calculated. Formula (25) indicates that two more com- 
ponents ^ and ^flp need to be known in order to be able to compute FIM. Below we show 
how these can be obtained. 

Let Y{t) be the concatenated vector of ip{t) and upper diagonal of the symmetric matrix 

V 

Y{t) = (Mt), Mt), Vi,,{t), Vj,,N{t), Vr,2{t), nr-i,iv(i)) (26) 

and Y{Yq, 0, t) be the concatenation of <f{(po, ©, t) and upper diagonal of V{Vo, 9, t). Simi- 
larly denoting the concatenation of the right hand sides of equations (3) and (11) by 1^ we 
can write 

jY{t)^W{Y{t),e,t). (27) 

To determine the derivative Zk{t) = we use the fact that it satisfies the following equation 
(see Appendix) 

^Zfc(t) = J{Y{t),Q,t)Zk{t) + Kk{t), (28) 

where J{Y{t),Q,t) is the Jacobian g^^W{Y{t),Q,t) evaluated at the solution Y{t) and 
Kk{t) is the vector ^W{Y{t),Q,t) also evaluated at Y{t). 

The solution of equation (28), Z{t), provides us with and therefore with Similarly 
Z{t) contains diagonal elements of 

Non- diagonal elements of ^ can be computed from diagonal elements using the recursive 
relation 



(29) 

from elements <l>(tj,tj+i), 2*^*'*^ tJ^E^*'*^ that are given by equations (9), (16) and (28) 
respectively. To simplify notation denote Sfc(s,t) = As $(s,t) is a solution of an 

ODE we use similar techniques as in equations (28) and write Sfc(s,t) as a solution of the 
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differential equation 

^{s, t) = A{^, e, t)E{s, t) + Mk{t), (30) 

wliere 

^'^^^ = ^(^(<?,e,t)$(s,t)) = (J;^(<^'e>'i) + (^^(<^'e>'i)^=^)^) "^M^ (3i) 

and H(s, s) = for all s. To summarise, for the experimental data distribution (15) the FIM 
(25) can be computed using equations (28 - 31): 

• the parameter derivative of the mean, ^'q^^ , can be extracted from a solution of (28) 

• diagonal elements of the parameter derivatives of the variance, ^'^gg^^^ , can be ex- 
tracted from a solution of (28) 

• non-diagonal elements of parameter derivatives of the variance, ^^gg^^^ , are given by 
formula (29), which involves (30) and (31). 

3.1 Summary of the numerical computation of the FIM 

Below we summarise in more details how the FIM can be calculated numerically. We start 
with the case of time series data as it is most general and the remaining two can be derived 
from it. 

Time series measurmens 



1 Read input: Stoichiometry matrix S, reaction rates vector F, 
initial conditions Xq, Vq 

2 Construct equations for (p (eq. (3)) and V (eq. (11)) and for Y (eq. (27)) 

3 Calculate symbolically the Jacobians A (eq. (6)), J (eq. (28)) and vectors 
Kk (eq. (28)) , Mk (eq. 30) (A; = 1,...,L) 

4 Solve equations for (p and V (eq. (27)) 

5 Compute fundamental matrices $(ti,tj+i) i = l,...,A^ — 1 (eq. (9)) 

6 Construct covariance matrix T^ts from V{ti) and ^{ti,ti+i) {i — l,...,n) according 
to eq. (16) 
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7 Compute ^ (solve eq. (28)) and extract ^ ^ according to eq. (26) 

8 Compute ^<^{ti,ti+i) (eq. (30)) 

9 Compute J^S^'"'^ for j ^>i + l,...,n and i^l,...,n (eq. (29)) 

10 Construct -^T,ts from objects computed in steps 7 and 9 

11 From ■^'^ > Sts> -^'^ts extract those elements corresponding to observed component 
according to relations (21) , (22) and (23) 

12 Compute FIM from elements obtained in the previous steps according to eq. 
(25) 

Time-point measurmens 

In order to compute the FIM for time-point measurements the covariance matrix, E^s, 
should be replaced by S^p. Additionally non-diagonal blocks of covariance matrix, S^p, 
are equal to 0, therefore steps 5, 8 and 9 are omitted and in step 3 vectors Mj need not be 
computed. 



Deterministic model data 

For the deterministic model the covariance matrix does not depend on parameters, therefore 
the formula for the elements of FIM simplifies to 

/(e),, ^ ^'^..(8)^, (32) 
and it requires only calculation of derivatives ^ for A; = 1, . . . , L. 
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4 Examples 

In this section we present details pertaining to the examples of models of single gene expres- 
sion and the p53 system. 

4.1 A model of a single gene expression 

The Table 1 contains parameter values used for numerical experiments presented in the main 
paper. 



Par am. 


Set 1 


Set 2 


Set 3 


Set 4 




100 


100 


20 


20 


kp 


2 


2 


10 


10 


Ir 


1.2 


0.7 


1.2 


0.7 


7p 


0.7 


1.2 


0.7 


1.2 



Table 1: Four parameter sets used in analysis of the single gene expression model. Sets 1 and 
3 correspond to slow protein degradation rate 7p and high and low transcription / translation 
ratio, respectively. On the other hand Sets 2 and 4 describe fast protein degradation rate 
and high and low transcription / translation ratio, respectively. All rates are per hour. 



4.1.1 Differences in sensitivity and robustness analysis between time-series, 
time-points and deterministic versions of the model 

Considering sensitivity and robustness analysis, there are three main differences between 
stochastic and deterministic systems. Firstly deterministic models completely neglect vari- 
ability in the abundances of molecular species. This variability is a function of the kinetic 
parameters and is therefore also sensitive to them. Secondly, the deterministic model does 



9 



Type 


# ident. par am. 


CRilogiK)) 


CR{log{k,))) 


Ci?(log(7.)) 


Ci?(log(7p)) 


det( FIM ) 


TS 


4 


0.0017 


0.0016 


0.0017 


0.0017 


4 ■ 103 


TP 


2 


0.0002 


0.0002 


0.0002 


0.0002 





DT 


1 


3 • 10-3 


3 • 10-3 


3 • 10-3 


3 ■ 10-3 






Table 2: Identifiability analysis for stationary state data. The table presents the number 
of non-zero eigenvalues ident. param.), Cramer- Rao bounds (CR), determinants of FIM 
(det(FIM)) for different data types (time series (TS), time point measurements (TP), de- 
terministic model (DT)). The number of non-zero eigenvalues equals to the number of (in 
principle) identifiable linear combinations of parameters and therefore describes the number 
of parameters that can be estimated given that others arc known. Quantities were calculated 
for parameter set 3 (see Table 1) and we have set the sampling frequency to A = 0.3h, and 
the number of measurements to n = 50. The system was supposed to be in the stationary 
state. We have assumed that a parameter is identifiable if an eigenvalue of FIM is not lower 
than 10^"* to take account of numerical inaccuracies, and therefore ident. param." is 
calculated as the number of eigenvalues that are greater or equal to 0.1% of the largest eigen- 
value. For the same reason the determinant was calculated as the product of eigenvalues that 
satisfies this condition. As not all parameters were identifiable in all versions of the model 
we calculated CR for individual estimates (assuming all other parameters to be known). For 
the deterministic model we have set variance of measurement error o"^ = 100 and no mea- 
surement error for TS and TP therefore CR-bounds between stochastic and deterministic 
models cannot be compared. 



not include correlations between molecular species. Thirdly, temporal correlations are also 

neglected. 

In the main paper we argued that these three factors can have a significant impact on how 
stochastic and deterministic systems respond to perturbations in parameters. Here we pro- 
vide further explanation using the model of single gene expression. The formulae for mean. 
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Type 


# ident. par am. 


CR{\og{K)) 


CR{\og{kp)) 


CR{log{^r)) 


Ci?(log(7p)) 


det(FIM) 


TS 
TP 
DT 


4 
4 
3 


0.0413 
0.0185 
0.47-10-4 


0.0112 
0.0036 
0.04 • 10-4 


0.0098 
0.0036 
0.07-10-4 


0.0072 
0.0020 
0.02 • 10-4 


6.96 ■ 104 
6.94 ■ 10^ 




Table 3: Identifiability analysis for perturbation experiment. Identical analysis as in Table 
2 but with an initial mean increased 5 fold and the initial variance 25 fold. 



variances and covariance for this model are 

(r) = ^ (33) 

(p) = ^ (34) 

(Sr^) = ^ (35) 

{Sp') = (P)(l + -^) (36) 

{SrSp) = ^^^'^ ■ (37) 

Kir ~l~ Tpj Tr- 

In order to understand the effect of incorporating variability into the sensitivity analysis we 
are considering changes in parameters, e.g. kp, 7^, by a factor S {hp, jp) {kp + Skp, Jp + S^p). 
Means of RNA and protein concentrations are not affected by this perturbation, whereas pro- 
tein variance is. 



To understand the effect of correlation between RNA and protein levels we note that 
formulae (33 - 37) demonstrate that at constant mean, correlation increases with 7^ and 
compensating decrease in kp. Figure 1 presents neutral spaces for all parameter pairs for dif- 
ferent values of correlation Prr) = —A^lM^. Differences between deterministic and stochastic 

model increase with correlation. 

We also perform similar analyses for different levels of temporal correlation between 
observations by varying the sampling frequency A. Figure 2 presents neutral spaces for all 
parameter pairs for three different sampling frequencies. The differences between stochastic 
and deterministic models decrease with A as the samples are less correlated for high A, and 
therefore the factor that distinguishes two models becomes less significant. 
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4.2 P53 system 



In the study of the 53 system we have used parameter estimates presented in Table 4. These 
parameters has been obtained by appropriate scahng of the parameters given in [8] . For all 
numerical experiments for p53 model we assumed sampling frequency A — Ih and number 
of measurements n = 30. 



Par am. 


Value 


P. 


90 




0.002 


Oik 


1.7 


k 


0.01 




1.1 


ao 


0.8 


ay 


0.8 



Table 4: Parameters of p53 system. 



4.2.1 Sloppiness analysis 

Here we compare neutral spaces for all pairs of parameters of the P53 model for tree 
data types (TS, TP, DT). Wc use parameter values presented in Table 4 and logarithmic 
parametrisation. Results are presented in Figues 3, 4 and 5. 



4.3 Dependance of analysis on parameter values and qualitative 
model behaviour 

Our method allows to study model sensitivity given the parameter values. Here we show 
that results depend on parameter values just as the dynamical behaviour of the system does; 
i.e. the sloppiness of a system also depends on the parameters and is not a fixed property 
of a mathematical model. Figure 8 demonstrates that p53 undergoes a Hopf bifurcation as 
parameter ay is varied from 0.8 to 2, while all other parameters remain unchanged. Param- 
eters thus determine the qualitative dynamical behaviour and therefore varying parameters 
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influences the structure of the FIM (compare Figures 3 and 9). Change in the FIM in turn 
has consequence for the magnitude of eigenvalues (Figure 8), sensitivity coefficients (compare 
Figures 4 in MP and 11) and informational content of TS and TP data (Figures 7 and 12). 

5 Appendix 

5.1 Calculating derivatives of a solution of ODE 

In order to derive equations (28) and (30) we difTcrcntiatcd the solution of an ODE with 
respect to a parameter. Here we provide more details about this procedure. Suppose that 
the differential equation being considered is 

X — F{x, 6, t), 

where x e and the set of parameters are collected together into a parameter vector 
G R^. Suppose that x{Q,xo,t) is the solution of interest with an initial condition xq — 
x{e,Xo,Q). For Y{t) = ^^^§f^ it can be shown [9] that 

Y^J{t)Y{t) + Kk{t) (38) 

where J(t) is the Jacobian of F with respect to x evaluated at x, Kk{t) is the n-dimensional 
vector g and F(0) = 0. 

5.2 Logarithmic parametrisation 

In the analysis of examples presented in our study we used a logarithmic parametrisation. 
Below we provide a rationale for this and explain that the FIM for logarithmic parametri- 
sation can be directly obtained from derivatives calculated to obtain the FIM for original 
parameters. 

In biochemical systems, the values of two parameters may differ by orders of magnitudes. 
Therefore, it is usually not appropriate to consider the absolute changes in the parameters 
9k, but instead to consider the relative changes. A good way to do this is to introduce new 
parameters rik = log(6'fc), because absolute changes in r)k correspond to relative changes in 
9k- Then, for small changes S9k to the parameters 9k, the changes rjk are scaled and non- 
dimensional. Analyses in terms of absolute and relative changes arc closely related and do 
not require additional computational cost. For any differentiable function f{9) 

dlog{9) dri 09' ^ ^ 
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and therefore any element of FIM for rj can be easily converted into that for 6 or vice versa 



drik 



T 



d(i 1 



-1 



) 




T 



9E 



-1 



) 



5.3 The FIM as a measure of system's sensitivity 

Here we provide an alternative explanation why the FIM provides a measure of sensitivity for 
a stochastic system. For notational simplicity we assume that the studied system depends 
on a single parameter 9 as generalisation for multidimensional parameter is straightforward. 
We start with definitions of classical sensitivity coefficients. 

5.3.1 Classical sensitivity coefRcient 

The classical sensitivity coefficient for observable Q and parameter 9 is defined as [10] 



Often sensitivity of relative changes needs to be considered. Given that Alog((5) ~ 
the formula for sensitivity of relative changes takes the form 



5.3.2 The FIM as a sensitivity measure for a stochastic system 

The behaviour of a stochastic system is not defined by an observable Q that can be measured 
experimentally in a reproducible way. It is instead defined by a distribution form which the 
measurements are taken. Suppose we want to construct a measure of a sensitivity of a 
distribution of a random variable X with density ip. Assume we want to examine relative 
changes of the distribution ip to changes in 9. This can be written as 



dQ 
89 ' 



d\og{Q) 
89 




log( 



ij{X,9 + d9) 
i^{X,9) 
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The above is the negative Kullback-Leibler divergence between distributions il^{X, 9 + 89) 
and ip{X^9). In order to study changes in ip resulting from "small" changes in 9 we divide 
the above equation by 89 and take the limit 89 ^ Q and get 

The above quantity is the average of the score function and it is basic fact of mathematical 
statistics that it equals to zero [7]. This observation suggests that it is better to study the 
squared differences 

/ (log(^(X, 9) - \og{ij{X, 9 + 89)f^P{X, 9)dX, 
Jx 

that lead to 



{\og{^l^{X,9)-\og{iJ{X,9 + ^9)f . [ ( ^ . f {ij{X,9 + 89)) \y 

(89^ ^ Jx [d9 [ m,0) ) ) 



which is precisely the definition of the FIM. 

The above derivation suggests that the FIM is a good measure of sensitivity of a proba- 
bility distribution and that there is a close link between Kulback-Leibler divergence and 
the FIM. The KL divergence measures the average relative difference between two distri- 
butions whereas FIM measures squared relative difference between a distribution and the 
same distribution with a perturbed parameter relative to the infinitesimal size of the squared 
perturbation. 
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Figure 1: FIM for a single measurement from stationary distribution of the model of single 
gene expression. For p^p = 0.1 (top), prp = 0.5 (middle), prp = 0.9 (bottom). Correlation 0.5 
was obtained using parameter set 3 from Table 1. Correlation was varied by equal-scaling 
of parameters fcp, 7p. 
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A=0.3 correlation =0.5 




Figure 2: FIM for a 20 times-series type measurements from the stationary distribution of 
the model of single gene expression for three different sampling frequencies: A = 0.3 (top) 
A = 3 (middle) A = 30 (bottom). Parameter set 3 from Table 1 was used. 
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Figure 3: Neutral space for time-series (heatmap) and deterministic (contour plot) versions 
of the p53 model. The FIM was calculated for the logarithms of parameters in Table 4. 
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Figure 4: Neutral space for time-series (heatmap) and time-point (contour plot) versions of 
the p53 model. The FIM was calculated for logarithms of parameters in Table 4. 
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Figure 5: Neutral space for time-points (heatmap) and deterministic (contour plot) versions 
of the p53 model. The FIM was calculated for logarithms of parameters in Table 4. 
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Figure 6: Sensitivity matrices Cf^ for p53 model for three data types (TS, TP, DT) calculated 
using parameters presented in Table 4. 
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Number of TP measurements per time-point 



Figure 7: Comparison of informational content of TP and TS samples for p53 model. Deter- 
minant of the FIM for TP data is plotted against number of measurements per time point 
(black line). Due to independence of measurements we observe the linear increase. For com- 
parison determinant of the FIM for a single TS sample is also depicted (red dashed line). 
Intersection of the two lines indicates how many TP measurements are necessary to obtain 
the same amount of information in a single trajectory (TS data). In this case around 6.5-10^ 
measurements are needed. 




Figure 8: Trajectories of the p53 system plotted together with the standard deviations 
bars. In the left panel parameters from Table 4 were used and in the right panel the same 
parameters except ay = 2. The system undergoes a Hopf bifurcation and moves from 
oscillatory behaviour to dynamics with a stable stationary state. 
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8log(b^) 8log(a^) 8log(aj) « log(k) 8log(bp 8log(a^) 8log(a^) 

Figure 9: Neutral spaces for time-points (heatmap) and deterministic (contour plot) versions 
of the p53 model. FIM was calculated for logarithmss of parameters in Table 4 except 
ay = 2. Differences compared with Figure 5 demonstrate the dependance of neutral spaces 
on parameter values and the qualitative dynamics of the system. 
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Eigen values normalized against model maximum 




1 2 3 4 5 6 7 



Figure 10: Eigenvalues of FIM for p53 model for three data types: time series (blue), time- 
points (green) and deterministic model (red). Eigenvalues were normalised against maximal 
eigenvalue for each data type (top) and against maximal eigenvalue among all three types 
(bottom). FIM was calculated for logs of parameters from Table 4 except ay = 2. Figure 
demonstrates that the behaviour of the eigenvalues depends on parameter values (compare 
with Figure 3 in the MP). 
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Diagonal elements of FIM for p53 system Overall sensitivities 




b_x a_x a_k k b_y a_0 a_y b_x a_x a_k k b_y a_0 a_y 

parameters 



Figure 11: Left: Diagonal elements of FIM for TS and TP versions of p53 model. Right: 
Sensitivity coefficients % for TS, TP, DT version of p53 model. FIMs were calculated for 
parameters presented in Table 4 except ay = 2. 




01 23456789 10 

Number of TP measurements per time-point x 1 0^ 



Figure 12: Comparison of informational content of TP and TS samples for p53 model simi- 
larly as Figure 7 but with ay = 2 instead of ay = 0.8. 
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